lxy-9602 (Collaborator) commented Jan 6, 2026

Purpose

Linked issue: #5

  1. Added vector search support for DataEvolutionBatchScan and GlobalIndexEvaluator.
  2. Renamed topk to vector_search.

Tests

GlobalIndexTest.TestDataEvolutionBatchScanWithVectorSearch

API and Format

Add VectorSearch to GlobalIndexEvaluator and ScanContext.

Copilot AI left a comment

Pull request overview

This PR adds vector similarity search support to the global index scanning infrastructure and renames all "topk" terminology to "vector_search" for better clarity and consistency.

Key changes:

  • Introduces a new VectorSearch struct that encapsulates the vector similarity search parameters: field name, limit, query vector, pre-filter, and predicate
  • Updates GlobalIndexEvaluator to support both predicate filtering and vector search, with predicates serving as pre-filters for vector search
  • Renames classes and methods from TopK/topk to VectorSearch/vector_search throughout the codebase
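To illustrate how a pre-filter interacts with a limited nearest-neighbor search, here is a minimal sketch. It is not the code from this PR: `VectorSearchSketch`, `SquaredL2`, and `BruteForceSearch` are hypothetical names, the field types are assumptions, the optional predicate field is omitted, and a brute-force scan stands in for a real vector index.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the VectorSearch struct described above.
// The real definition in include/paimon/predicate/vector_search.h may
// use different names and types.
struct VectorSearchSketch {
    std::string field_name;                   // vector column to search
    std::size_t limit = 0;                    // maximum number of results
    std::vector<float> query;                 // query vector
    std::function<bool(int64_t)> pre_filter;  // optional: keeps a row id if it returns true
};

// Squared Euclidean distance between two vectors of equal dimension.
float SquaredL2(const std::vector<float>& a, const std::vector<float>& b) {
    assert(a.size() == b.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const float d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

// Brute-force search (assumption: a real index would avoid the full scan).
// Rows rejected by the pre-filter never become candidates, so the limit
// applies only to rows that pass it.
std::vector<std::pair<int64_t, float>> BruteForceSearch(
    const std::vector<std::vector<float>>& rows, const VectorSearchSketch& search) {
    std::vector<std::pair<int64_t, float>> hits;
    for (std::size_t i = 0; i < rows.size(); ++i) {
        const int64_t row_id = static_cast<int64_t>(i);
        if (search.pre_filter && !search.pre_filter(row_id)) {
            continue;
        }
        hits.emplace_back(row_id, SquaredL2(rows[i], search.query));
    }
    // Nearest first; ties are broken by row id for a deterministic order.
    std::sort(hits.begin(), hits.end(),
              [](const std::pair<int64_t, float>& l, const std::pair<int64_t, float>& r) {
                  return l.second < r.second || (l.second == r.second && l.first < r.first);
              });
    if (hits.size() > search.limit) {
        hits.resize(search.limit);
    }
    return hits;
}
```

Applying the filter before ranking matters: filtering after taking the nearest `limit` rows could return fewer results than requested even when enough matching rows exist.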

Reviewed changes

Copilot reviewed 31 out of 31 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| include/paimon/predicate/vector_search.h | New header defining VectorSearch struct with field name, limit, query vector, pre-filter function, and optional predicate |
| include/paimon/global_index/global_index_reader.h | Renamed VisitTopK to VisitVectorSearch in the GlobalIndexReader interface |
| include/paimon/global_index/global_index_result.h | Renamed TopKGlobalIndexResult to VectorSearchGlobalIndexResult and updated related iterator classes |
| include/paimon/global_index/bitmap_vector_search_global_index_result.h | Renamed BitmapTopKGlobalIndexResult to BitmapVectorSearchGlobalIndexResult |
| include/paimon/scan_context.h | Added VectorSearch support to ScanFilter and ScanContextBuilder |
| include/paimon/table/source/table_read.h | Added documentation note about thread-safety of BatchReaders |
| src/paimon/core/global_index/global_index_evaluator.h | Updated Evaluate method to accept both predicate and vector_search parameters with documentation on their interaction |
| src/paimon/core/global_index/global_index_evaluator_impl.h | Added new methods for vector search evaluation and helper methods to get index readers |
| src/paimon/core/global_index/global_index_evaluator_impl.cpp | Implemented vector search evaluation logic including conversion of predicate results to pre-filters |
| src/paimon/core/global_index/global_index_scan_impl.cpp | Updated parallel scan to accept and process vector search parameters |
| src/paimon/core/table/source/data_evolution_batch_scan.cpp | Added vector search support to batch scanning with proper result handling |
| src/paimon/core/operation/scan_context.cpp | Implemented vector search getter/setter in ScanFilter and ScanContextBuilder |
| src/paimon/global_index/lumina/lumina_global_index.cpp | Updated lumina index reader to use VectorSearch struct instead of individual parameters |
| src/paimon/common/global_index/bitmap_vector_search_global_index_result.cpp | Implementation of renamed result class with vector search semantics |
| test/inte/global_index_test.cpp | Comprehensive test coverage for vector search functionality including new test TestDataEvolutionBatchScanWithVectorSearch |
| Multiple test files | Updated all test code to use new vector_search terminology and API |
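The summary for global_index_evaluator_impl.cpp mentions converting predicate results into pre-filters. A rough sketch of that idea follows, assuming the predicate result can be viewed as a list of matching row ids. `PreFilter` and `MakePreFilter` are hypothetical names, and `std::unordered_set` is only a stand-in for whatever bitmap type the implementation actually uses.

```cpp
#include <cstdint>
#include <functional>
#include <memory>
#include <unordered_set>
#include <vector>

// Hypothetical alias: a pre-filter answers "may this row id be returned?".
using PreFilter = std::function<bool(int64_t)>;

// Wraps the row ids matched by a predicate in a membership test.
// The set is held by shared_ptr so that copies of the returned
// std::function share one set instead of duplicating it.
PreFilter MakePreFilter(const std::vector<int64_t>& matching_row_ids) {
    auto ids = std::make_shared<std::unordered_set<int64_t>>(
        matching_row_ids.begin(), matching_row_ids.end());
    return [ids](int64_t row_id) { return ids->count(row_id) > 0; };
}
```

With this shape, a vector index reader only needs to call the filter on each candidate row id and does not need to know how the predicate was evaluated.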
Comments suppressed due to low confidence (1)

src/paimon/common/global_index/bitmap_vector_search_global_index_result.cpp:93

  • The comment "current and other have has no intersection" contains a duplicate word "has". It should read "have no intersection" or "has no intersection".

@lszskye lszskye merged commit 014cbe9 into alibaba:main Jan 7, 2026
8 checks passed